Asymptotically pivotal statistic for surrogate testing with extended hypothesis 
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The method of surrogate data provides a framework for testing observed data against a hierarchy 
of alternative hypotheses. The aim of applying this method is to exclude the possibility that the data 
are consistent with simple linear explanations before seeking complex nonlinear causes. However, in 
recent time the method has attracted considerable criticism, largely as a result of ambiguity about 
the formation of the underlying null hypotheses, or about the power of the chosen statistic. In this 
communication we show that by employing a special family of ranks statistics these problems can 
be avoided and the method of surrogate data placed of a firm statistical foundation. 
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Since the framework of surrogate testing [l^, [2(| was 
proposed, it rapidly became a popular method to distin- 
guish linear stochastic noise from nonlinear irregular time 
series, and was widely applied in many fields [Tl|, [T^, HH 
albeit its intrinsic systematic problems, among which two 
of the most noticeable ones are: (1) "Wraparound arti- 
fact" [1, Qjl Ell j that is, adopting the Fourier transform 
to generate surrogates will result in artifact circular auto- 
correlation function; (2) Inconsistency for the surrogates 
to represent the null hypothesis 0, EJ • 

A few remedies were proposed to deal with the prob- 
lems of (1) and (2). For example, for problem (1), 
to avoid the Wraparound artifact, the idea of limited 
phase randomization was suggested 

0,113 ■ A more elab- 
orate solution (but more computation-intensive mean- 
while) was to adopt the method of simulated annealing 
[3]. For problem (2), to decrease the inconsistency be- 
tween the surrogates and the null hypothesis, the com- 
bination of linear and nonlinear discriminating statistics 
was recommended Q. 

A closer examination on problem (1) and (2) reveals 
that, both problems are mainly ascribed to the adoption 
of the Fourier transform. Thus by discarding this tech- 
nique one may circumvent those problems as well. As an 
example, a new algorithm was recently devised based on 
the linear superposition principle @, Q . 

This work is going to expose another important, yet 
less discussed problem. As pointed out in |20j, a piv- 
otal discriminating statistic could increase the accuracy 
of surrogate testing. Furthermore, with a pivotal statis- 
tic, the generated surrogates need not be constrained re- 
alizations, this therefore presents another possible strat- 
egy to relieve the problem of inconsistency between sur- 
rogates and the corresponding hypothesis. Here, a "piv- 
otal statistic" refers to the statistical measure that has 
the same distribution for all of the processes consistent 



with a composite null hypothesis [1, [l7]> H3] > where a com- 
posite null hypothesis means that the set of the relevant 
processes is not singleton. As an example, the correla- 
tion dimension was shown to be a pivotal test statistic 
for linear Gaussian noise [l6j |. 

Our focus is to derive a test statistic, namely the 
Wilcoxon signed rank statistic, which will be demon- 
strated to be pivotal for a family of linear stationary 
noise, including the linear stationary Gaussian noise con- 
sidered in [ll, |2(j, and asymptotically pivotal for the 
summation over realizations of linear stationary noise 
otherwise. Moreover, we will also show that, under our 
hypothesis (to be specified later), in general the Wilcoxon 
statistic will have a known asymptotical distribution (and 
even exact in certain situations), which is a considerable 
advantage for statistical inference in surrogate testing Q , 
in contract to the popular statistics in the literature (in- 
cluding the correlation dimension), whose distributions 
are often analytically untraceable due to the complica- 
tion in their computations. 

In this work we confine our discussion to stationary 
time series as well [IH Il9l . [20j . Our null hypothesis is to 
assume that the time series {xi\ under study is from a 
general p-th order linear autoregressive process, i.e., 
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with unknown coefficients {a«}f = g, regressive order p and 
innovation term ej. In general, based on the Wold's de- 
composition [l3| of a stationary stochastic time series, 
the innovation term in Eq. ([T]) is shown to be white. 
In particular, if is assumed to follow a Gaussian dis- 
tribution, then it returns to the scenario considered in 

To derive our test statistic, let us first consider a spe- 
cific family of the innovation terms. We call innovation 
terms {e^} jointly symmetric if there exists a constant 
fi such that {ei — /i} and {/i — e.;} follow the same joint 
distribution, i.e, their probability density functions sat- 
isfy that /(ei - n, e 2 - /«, ...) = /(/U - ei,/z - e 2 , ...) 0. 
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For instance, innovation terms with normal or uniform 
distributions are jointly symmetric. 

For a linear stochastic process {xi} with jointly sym- 
metric innovation terms, we use an ordinary least square 
(OLS) predictor 

x^ = Oi )0 + V P ai,A fc " J (2) 

* — '3=1 

to obtain the fc-step ahead prediction value x\ given the 
historical record {xi, Xi—i,X{—2, •••} up to time index i, 
where {ciij : j — 0, 1, are the coefficients estimated 

based on the history (up to time index i) with the fitting 
order p' . Consequently, we have the corresponding pre- 
diction error = ajj+fc — i\ . And it was proved that the 
prediction error is symmetric about zero, which is in- 
dependent of the choice of fitting order p' [l| . With this 
fact, e\ and —e\ shall share the same distribution so that 
their probability Pr(e* : > 0) = Pr(-e* > 0) = Pr(ef < 
0) = 1/2 (with the assumption that Pr(e* = 0) = in 
the sense of Lebesgue measure). Furthermore, let ij(ef) 
denote the indicator of e\ satisfying that Ii(e\) = 1 if 
e\ > and Ii(e!?) = —1 otherwise, then it can be shown 
that {li(ef)} is an independent series 

The signed rank statistic SR m can be derived from a 
set of m prediction errors {e^ : i = i si , i Sm }, where 
{i Sj : j — 1,2, ...,m} are admissible indices for a given 
time series. Without lost of generality, let us suppose 
the time indices i = 1,2, ...,m, then a test statistic can 
be constructed as follows 

SR m =Y / m hiel) x Si{rank{\e1\)), (3) 

* '2—1 

where {Si{rank{\e^ |)}™ 1 is the set of scores of {|e*|}™ : 
with rankQe^l) denoting the rank of the absolute value 
|e*| among the set {lefl}™!- In this work we choose 
Si(rank(\e^\)) = rank(\ef\) so that the test statistic is 
reduced to the Wilcoxon signed rank statistic [!, 0] 

W m = Y / m ixIi(^)- (4) 

For more detail about the signed rank statistic, the read- 
ers are referred to, for example, [8|. 

In Eq.Q W m is discretely distributed. Theoretically 
one can enumerate all of its possible values and thus ob- 
tain the full knowledge of its distribution. But in practice 
enumeration will be inefficient for large m. Thus for the 
sake of convenience, it is suggested to use the Gaussian 
distribution N(0, m(m + l)(2m + l)/6) for approxima- 
tion based on the central limit theorem, as adopted in 
our work. It was shown that, even for a small integer, 
say m = 6, one can still approximate the discrete distri- 
bution quite well [|[. 

The above deduction means that the Wilcoxon signed 
rank statistic W m is a pivotal test statistic for the linear 
stochastic processes with jointly symmetric innovations. 
But for a general linear stochastic process, the foregoing 
conclusion does not necessarily hold because of the pos- 
sible asymmetry of innovation terms. Nevertheless, we 



may apply the linear superposition principle, as will be 
described below, to reduce the asymmetry based on the 
central limit theorem. Through numerical experiments 
we will also show that, for nonlinear processes, the sum- 
mation over its realizations might exhibit different char- 
acters from those of linear stochastic ones. 

Without lost of generality, let us consider, for exam- 
ple, two linear stochastic time series {x s }^- and {x t }^j, 
produced by the same process described in Eq. J!}, 
then their summation {yh} l h=0 = {ijh ■ Vh = cix. ]+h + 
c 2Xk+h}h = o, for any coefficients c\ and C2, also follows a 
linear stochastic process 

ajVh-j + (cie s+h + c 2 e t+h ) (5) 

3 = 1 

with the innovation terms being (c±e s +h + C2£t+h)- This 
conclusion can be extended to general situations with, 
say, n time series. And then the drift and the innovation 

term become (X)"=i c ») a o arK i S™=i c * et i' wnere c i an d 
ti are the coefficient and time index associated with the 
i-th time series respectively. 

Since the white noise {e,;| under consideration is con- 
sidered as an independent [2l| sequence with finite vari- 
ance (because of the stationarity) , the central limit the- 
orem is applicable to the summation J27=i c i e ti, i-e., one 
may expect that the distribution of X)"=i c » e *; can ^ e a P" 
proximated by a normal distribution for large n. There- 
fore, even if originally e ti does not have a jointly sym- 
metric distribution, the asymmetry would be reduced by 
summing over a number of independent variables. Thus 
the Wilcoxon statistic W m is an asymptotically pivotal 
measure for the summation of a number of linear stochas- 
tic time series. In general, depending on the choice of 
scores in Eq. ([3]), one may derive a family of asymptoti- 
cally pivotal test statistics. 

In the following we will verify the above idea through 
few numerical examples, which are: (a) ARMA(1, 1) pro- 
cess Xi = 0.1 + 0.5xi-i + ti — 0.5ei_i, where ej follows 
the normal distribution iV(0, 1); (b) AR(2) process Xi = 
0.9xi-i — 0.3xi-2 + T]i with rji following the (asymmetric) 
beta distribution f(x) = x(l -x) 4 /B(2, 5), where B{2, 5) 
denotes the beta function. For comparison of the perfor- 
mance, we also include two nonlinear cases: (c) Henon 
map H(x, y) = (y + 1 — lAx 2 , 0.3a;); and (d) Rossler sys- 
tem (x, y, z) = {—y — z, x + 0.15y, 0.2 + xz — lOz), where 
for both cases the x components will be singled out for 
calculation. 

As the first step, we apply the Wilcoxon statistic W m 
to directly examine the above data generation processes 
(DGPs). At this stage, let us test the null hypothesis 
which assumes that the time series under examination 
is linear stochastic with jointly symmetric innovations, 
while later we will extend the hypothesis to a broader 
range as aforementioned. 

For each DGP, the hypothesis testing is conducted 
1000 times. Each time a different realization of the same 
DGP is produced by varying the initial condition(s). 
Each realization contains 2000 data points, among which 
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(a) (b) 
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FIG. 1: Prediction errors (p' = 8) of: (a) ARMA(l, 1) pro- 
cess, (b) AR(2) process, (c) Henon map, and (d) Rossler sys- 
tem, which are obtained by directly applying the OLS predic- 
tor to the time series without summing over different realiza- 
tions. 
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FIG. 2: Prediction errors (p' — 8) of summations over 20 
realizations of: (a) ARM A(l, 1) process, (b) AR(2) process, 
(c) Henon map, and (d) Rossler system. 



the last 200 points are taken out for out-of-sample one- 
step ahead prediction based on the OLS predictor in Eq. 
([2]), thus the number of prediction errors m=200. Recall 
that, for a linear stochastic process with symmetric inno- 
vation terms, theoretically the fitting order p' in Eq. ([2]) 
does not affect the symmetry of the prediction errors, so 
in our calculations we do not choose any particular fit- 
ting orders, instead we simply set p' — 6, 7, 8, 9, 10 for all 
cases. For demonstration, one example of such prediction 
errors is plotted in Fig. [T]for each DGP (p'=8). 

Since the distribution of W m is known (N(0, m(m + 
l)(2m + l)/6)), we can specify the false rejection rate 
a (i.e., the rate that one falsely rejects the null hypoth- 
esis), which is associated with two critical points C Q / 2 
and Ci^ a /2 f° r two-sided tests, where in general nota- 
tion C$ denotes the critical point at which the probabil- 



TABLE I: Rejections of the null hypothesis for 1000 realiza- 
tions of each DGP with different fitting orders. 



DGP Rejections of fitting order 


6 7 8 9 10 


ARM A( 1,1) 


56 55 53 52 53 


AR(2) 


78 83 80 80 81 


Henon 


73 64 69 61 67 


Rossler 


79 396 920 862 963 



TABLE II: Rejections of the null hypothesis for 1000 summa- 
tions over 20 realizations of each DGP with different fitting 
orders. 



DGP 


Rejections of fitting order 


6 7 8 9 10 


ARMA(1,1 


) 55 53 54 52 53 


AR(2) 


45 43 46 48 48 


Henon 


74 61 71 66 72 


Rossler 


220 820 936 974 



ity function Pr(x) of a random variable x satisfies that 
Pr(x < C$) = S. Therefore if the value of W m in a test 
falls on the interval [C a /2, Ci— a/2] > we do not reject the 
hypothesis, otherwise we reject. 

In our calculations, we set a = 5% for all cases. There- 
fore one would expect that, for a process consistent with 
the null hypothesis, e.g., the ARMA(1, 1) process, the 
actual rejection rate shall be the same as the nominal 
one 5% (with slight statistical fluctuations in practice). 

But for the processes not consistent with the null, this 
conclusion is not necessarily true. In fact, as indicated in 
Table HI the actual rejection rates of the AR(2) process, 
the Henon map and the Rossler system deviate from the 
nominal in all cases, therefore we could reject the null 
hypothesis, which means that those three DGPs cannot 
be linear stochastic processes with jointly symmetric in- 
novation terms. 

Next let us extend the tests to a broader range, i.e., 
we consider the null hypothesis assuming that the time 
series is linear stochastic governed by Eq. |T]), but now 
let us remove the assumption of joint symmetry previ- 
ously imposed on the innovation term e^. In general, 
the prediction errors are not necessarily symmetric, con- 
sequently the Wilcoxon statistic might not really follow 
the expected distribution, thus the actual rejection rates 
would deviate from the specified one. For example, see 
the AR(2) process (with the innovation term following 
the asymmetric beta distribution) demonstrated in Ta- 
ble D 

The above problem could be tackled by instead testing 
the summations over a number of different realizations of 
the same underlying process, as discussed previously. In 
our calculations, 20 realizations are summed over for each 
test (As before, there are 1000 tests in total). Before the 
summation each of them is multiplied by a random coef- 
ficient uniformly distributed on [1,2], therefore all of the 
realizations are nearly equally weighted to prevent over- 
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dominance of any particular sequence. For comparison, 
all the other computation settings are simply the same 
as those adopted at the first step, and the test results are 
reported in Table HT1 from which we could see that, sta- 
tistically the tests on the summations do not affect the 
rejection rate of the ARMA(1, 1) process. However, for 
the AR(2) process its rejection rates now become very 
close to the nominal one (with possible statistical fluctu- 
ations), which supports our argument. In contrast, the 
rejection rates of the Henon map and the Rossler system 
still deviate from the expected one, therefore one could 
reject the null hypothesis, which means that their real- 
izations are not generated by linear stochastic processes. 
Again, for demonstration we plot in Fig. [2] the prediction 
errors of the summations for each DGP (p' = 8). 



Up to now, we have presented two folds of advantages 
of the Wilcoxon signed rank statistic derived in this work. 
One is that, this measure has an explicitly enumerable 
distribution, therefore one could specify the exact false 
rejection rate for surrogate testing, which is an unreach- 
able property for many test statistics adopted in the rele- 
vant literature. The other is that, this statistic is pivotal 
for a special family of linear stochastic processes, includ- 
ing the widely studied linear Gaussian noise. Moreover, 
it also exhibits to be asymptotically pivotal for the sum- 
mations of realizations of general linear stochastic pro- 
cesses. Because of this property it becomes possible now 
to extend the null hypothesis of surrogate testing to more 
general situations. 
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